Apparatus for analyzing cardiac events

ABSTRACT

An apparatus for detecting cardiac events in electrograms has a feature extraction unit that derives features of the cardiac events for discriminating among different types of cardiac events. A clustering unit groups cardiac events with similar features into respective clusters defined by predetermined cluster features. The feature extraction unit determines a feature vector describing waveform characteristics of cardiac events in the electrogram by a wavelet transform. The clustering unit determines the distance between the feature vector and corresponding cluster feature vectors in order to assign the cardiac event in question to that cluster which results in a minimum distance, providing that the minimum distance is less than a predetermined threshold value. A heart stimulator provided with such a cardiac event detecting apparatus detects the occurrence of an arrhythmia and appropriately controls a pulse stimulator to treat the arrhythmia.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to an apparatus for analyzing cardiac events detected in electrograms (EGMs), and to a heart stimulator provided with such an apparatus.

In the following the expression “cardiac event” denotes the depolarization phase in the cardiac cycle, which for atrial signals is commonly known as P wave and for ventricular signals as R wave or QRS complex.

2. Description of the Prior Art

In the field of devices for cardiac rhythm management (CRM), accurate rhythm classification is an increasingly important aspect. Pacemakers are primarily used to assist in bradycardia or when the electrical propagation path is blocked, whereas the primary use of implantable cardioverter defibrillators (ICD) is to terminate ventricular arrhythmia, a life-threatening condition if not immediately treated. In both types of devices, accurate event classification of the electrogram signal is needed for identifying, e.g., atrial and ventricular fibrillation in order to give appropriate therapy for the detected arrhythmia. For pacemakers, this may necessitate changing the pacing mode in order to stabilize the ventricular rhythm during an episode of atrial fibrillation. An ICD responds to ventricular fibrillation by giving a defibrillating shock intended to terminate the fibrillation.

Ever-increasing demands are put on both kinds of devices to better handle their primary task as well as to manage tasks other than those for which they were originally intended. One such task may be, for an implantable medical device, to identify atrial flutter in order to terminate it by atrial pacing or to defibrillate atrial fibrillation. Although it is not a life-threatening arrhythmia, atrial fibrillation is an inconvenience to the patient and increases the risk for other diseases such as stroke. Atrial pacing may also be one way of terminating supraventricular tachycardias. An ICD-specific task is to identify atrial fibrillation so that it is not mistaken for ventricular fibrillation, thereby avoiding the risk of giving an unnecessary, and possibly harmful, defibrillation shock. Another, more general, utilization is to efficiently store rhythm data for later analysis and evaluation, as is already done in modern ICDs. By collecting data, better knowledge of the evolution of cardiac diseases and the functionality of the device can be obtained.

Clustering represents an important task within the classification problem where each individual event is assigned to a cluster of events with similar features. Labelling of the clusters, i.e., associating the cluster with a specific cardiac rhythm, completes the classification such that the device can provide proper therapy when needed. However, certain constraints distinguish clustering in CRM devices from clustering in general. In order to give immediate therapy, clustering must be done in real time, which excludes many iterative clustering algorithms such as k-means clustering and competitive learning. Various methods have recently been presented concerning clustering of signals from the surface electrocardiogram (ECG), based on, e.g., self-organizing maps or fuzzy hybrid neural networks, see M. Lagerholm, C. Peterson, G. Braccini, L. Edenbrandt, and L. Sörnmo, “Clustering ECG complexes using Hermite functions and self-organizing maps”, IEEE Trans. Biomed. Eng., vol. 47, pp. 838-848, July 2000, and S. Osowski and T. Linh, “ECG beat recognition using fuzzy hybrid neural network”, IEEE Trans. Biomed. Eng., vol. 48, pp. 1265-1271, November 2001. However, most clustering algorithms used for ECG analysis are computationally rather complex and therefore unsuitable for implantable CRM devices. Furthermore, not much a priori morphologic information is associated with the various rhythms in the electrogram (EGM); this is in contrast to the more well-defined ECG.

Previous work in the area of intracardiac event classification mainly focused on discrimination of a specific condition in order to discern, e.g., atrial fibrillation from other atrial tachyarrhythmias, see A. Schoenwald, A. Sahakian, and S. Swiryn, “Discrimination of atrial fibrillation from regular atrial rhythms by spatial precision of local activation direction”, IEEE Trans. Biomed. Eng., vol. 44, pp. 958-963, October 1997. Other applications involve discrimination of ventricular from supraventricular tachycardia, see L. Koyrakh, J. Gillberg, and N. Wood, “Wavelet based algorithms for EGM morphology discrimination for implantable ICDs”, in Proc. of Comp. in Card. (Piscataway, N.J., USA), pp. 343-346, IEEE, IEEE Press, 1999, and G. Grönefeld, B. Schulte, S. Hohnloser, H.-J. Trappe, T. Korte, C. Stellbrink, W. Jung, M. Meesmann, D. Böcker, D. Grosse-Meininghaus, J. Vogt, and J. Neuzner, “Morphology discrimination: A beat-to-beat algorithm for the discrimination of ventricular from supraventricular tachycardia by implantable cardioverter defibrillators”, J. Pacing Clin. Electrophysiol., vol. 24, pp. 1519-1524, October 2001. More general classification algorithms, which in turn involve training on individual patients, have been based on analog neural networks or wavelet analysis for morphologic discrimination of arrhythmias.

PCT Application WO 97 39 681 describes a defibrillator control system comprising a pattern recognition system. The intracardiac electrogram signal is digitised and delivered to a feature selector. The feature selector outputs selected features to a trained classifier, which provides information as to which group, e.g. ventricular tachycardia, the signal should be assigned. The classifier outputs the classified information for use in a therapeutic decision.

U.S. Pat. No. 5,271,411 discloses an ECG signal analysis and cardiac arrhythmia detection by extraction of features from a scalar signal. A QRS pattern vector is then transformed into features describing the QRS morphology, viz. a QRS feature vector. A normal QRS complex is identified based on the population of QRS complexes located within clusters of QRS features within a feature space having a number of dimensions equal to the number of extracted features. The extracted morphology information is then used for judging whether a heartbeat is normal or abnormal.

U.S. Pat. No. 5,638,823 describes non-invasive detection of coronary artery disease. A wavelet transform is performed on an acoustic signal representing one or more sound events caused by turbulence of blood flowing in an artery to provide parameters for a feature vector. This feature vector is used as one input to neural networks, the outputs of which represent a diagnosis of coronary stenosis in a patient.

In Michael A. Unser et al, “Wavelet Applications in Signal and Image Processing IV”, Proceedings SPIE—The International Society for Optical Engineering, 6-9 Aug. 1996, vol. 2825, part two of two parts, pp. 812-821, a wavelet packet based compression scheme for single lead ECGs is disclosed, including QRS clustering and grouping of heart beats of similar structures. For each heart beat detected, its QRS complex is compared to templates of previously established groups. Point-by-point differences are used as similarity measures. The current beat is assigned to the group whose template is most similar, provided predetermined conditions are satisfied. Otherwise a new group is created with the current QRS complex used as the initial group template.

SUMMARY OF THE INVENTION

An object of the present invention is to provide a technique for separation of cardiac rhythms in a reliable way on the basis of electrogram event clustering.

The above object is achieved in accordance with the invention by an apparatus for analyzing cardiac events detected in electrograms (EGMs) having a feature extraction unit that derives features of the cardiac events for discriminating among different types of detected cardiac events, and a clustering unit that groups cardiac events with similar features into a cluster, defined by predetermined cluster features, wherein the feature extraction unit determines a feature vector that describes waveform characteristics of the cardiac event EGM signals by means of a wavelet transform, and wherein the clustering unit determines a distance or distances between the feature vector and corresponding cluster feature vectors in order to assign the cardiac event in question to the cluster that results in a minimum distance, as long as the minimum distance is less than a predetermined threshold value.

Certain arrhythmias are diagnosed immediately to give proper therapy, while, for others, it may be sufficient to record the rhythm for data collection purposes. Thus, the classification problem, viz. to label the rhythms based on clusters using clinical terms, may not always be necessary to implement.

In an embodiment of the apparatus according to the invention both morphologic and temporal data are considered for clustering. Morphologic features are efficiently extracted by use of the dyadic wavelet transform after which the events are grouped by a leader-follower clustering embodiment. The event detection problem, based on the same transform, is previously treated in Swedish patent application no. 0103562-5.

According to another advantageous embodiment of the apparatus according to the invention an integrator is provided to integrate said distance over a predetermined period of time. By integrating the distance over a period of time it is possible to distinguish irregular rhythms, like atrial fibrillation, from regular rhythms. The integral total distance in the case of atrial fibrillation will be high whereas regular rhythms will result in a lower total distance.
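As an illustration only, the integration of the distance over a period of time could be sketched as a sliding-window sum. The class name `DistanceIntegrator`, the helper `is_irregular`, the window length and the limit are hypothetical and are not taken from the specification; the sketch merely assumes that each detected event contributes its minimum cluster distance.

```python
from collections import deque


class DistanceIntegrator:
    """Running sum of cluster distances over the last `window` seconds."""

    def __init__(self, window):
        self.window = window      # hypothetical integration period, in seconds
        self.samples = deque()    # (event time, distance) pairs inside the window
        self.total = 0.0

    def add(self, t, distance):
        """Register the distance of the event at time t; return the windowed sum."""
        self.samples.append((t, distance))
        self.total += distance
        # Drop events that have fallen out of the integration window.
        while self.samples and self.samples[0][0] <= t - self.window:
            _, old = self.samples.popleft()
            self.total -= old
        return self.total


def is_irregular(total, limit):
    """Flag a rhythm as irregular when the integrated distance exceeds `limit`."""
    return total > limit
```

With such a sketch, a regular rhythm whose events stay close to one cluster center would keep the sum low, whereas an irregular rhythm would be expected to drive it above the chosen limit.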

The invention also relates to a heart stimulator provided with the above-mentioned apparatus for controlling the therapeutic stimulation depending on arrhythmia detection.

DESCRIPTION OF THE DRAWINGS

FIGS. 1 a and 1 b illustrate impulse responses of a filter bank used for cardiac event detection in accordance with the present invention.

FIG. 2 is a flow chart of a clustering algorithm performed in the apparatus according to the present invention.

FIGS. 3 a and 3 b illustrate the computational complexity for different clustering algorithms.

FIG. 4 a illustrates clustering performance in terms of probability for correct clustering of an event, as well as the probability of a dominant event in the cluster, and FIG. 4 b illustrates a histogram spread of the number of clusters.

FIG. 5 a illustrates clustering performance as a function of a distance threshold, and FIG. 5 b illustrates the number of clusters as a function of the distance threshold.

FIG. 6 illustrates the clustering performance for one set of parameters.

FIGS. 7 a and 7 b are examples of the clustering result for a connected EGM for three cases.

FIG. 8 is a flowchart describing the general functioning of an embodiment of an apparatus according to the invention.

FIG. 9 is a block diagram of a heart stimulator provided with a cardiac event detecting apparatus in accordance with the present invention.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

P Wave Detection and Feature Extraction

A signal model assuming that the event waveform is composed of a linear combination of representative signals is considered. The feature extraction problem is then to estimate the individual components of the representative signals since each morphology will have its own linear combination. By using the dyadic wavelet transform, different widths of the two fundamental monophasic and biphasic waveforms are included in the model at a low cost.

Feature Extraction

It is assumed that the QRS waveform is composed of a linear combination of representative signals,

$\begin{matrix} {H = \begin{bmatrix} h_{1} & \ldots & h_{P} \end{bmatrix}} & (2) \end{matrix}$

where each function $h_{j}$, j=1, . . . , P, is of size M×1. The only restriction on H is that it must have full rank. Different morphologies, s(n), are modelled by the P×1 coefficient vector θ(n), with the linear model

$\begin{matrix} {{s(n)} = {H\theta(n)}} & (3) \end{matrix}$

where n is a temporal variable describing when the event occurs. The observed signal, x(n), is assumed to be modelled by

$\begin{matrix} {{x(n)} = {{H\theta(n)} + {w(n)}}} & (4) \end{matrix}$

where the M×1 noise vector w(n) is assumed to be zero-mean, white, and Gaussian with variance $\sigma_{\omega}^{2}$. Consequently, x(n) is defined as

$\begin{matrix} {{x(n)} = \begin{bmatrix} {x(n)} \\ \vdots \\ {x\left( {n + M - 1} \right)} \end{bmatrix}} & (5) \end{matrix}$

implying an event duration of M samples, beginning at n. The probability density function of x(n) for a specific realization of θ(n), p(x(n); θ(n)), is thus given by

$\begin{matrix} {{p\left( {{x(n)};{\theta(n)}} \right)} = {\frac{1}{\left( {2\pi\sigma_{\omega}^{2}} \right)^{M/2}}{\exp\left\lbrack {{- \frac{1}{2\sigma_{\omega}^{2}}}\left( {{x(n)} - {H\theta(n)}} \right)^{T}\left( {{x(n)} - {H\theta(n)}} \right)} \right\rbrack}}} & (6) \end{matrix}$

In this model, a complete description of a QRS complex is provided by the deterministic unknown parameter vector θ(n). The absence of a QRS complex corresponds to the case when θ(n) is equal to 0, where 0 is the P×1 zero vector. In general, no a priori knowledge is available on θ(n), and therefore an estimate is required before detection can take place. Furthermore, only one event is assumed to take place within the observation interval 0≦n≦N.

Filterbank Representation

The descriptive functions in H have been selected such that the following three aspects have been taken into particular account:

1. the main morphologies of the QRS complex are mono- and/or biphasic,
2. the broad range of QRS complex durations, and
3. a low complexity implementation.

The wavelet transform is particularly suitable since it is a local transform, i.e., it provides information about the local behaviour of a signal. One wavelet decomposition method which may be efficiently implemented is the dyadic wavelet transform. By careful selection of the filters, a suitable filter bank including mono- and biphasic impulse responses can be obtained. A symmetric lowpass filter, f(n), is used repeatedly in order to achieve proper frequency bands. This filter is combined with one of two filters, g_(b)(n) or g_(m)(n), which together define the waveform morphology (the subindices b and m denote bi- and monophasic, respectively). For the biphasic case, the recursion is expressed as

$\begin{matrix} \begin{matrix} {{h_{1,b}(n)} = {g_{b}(n)}} \\ {{h_{2,b}(n)} = {{f(n)}*{g_{b}\left( {2n} \right)}}} \\ {{h_{3,b}(n)} = {{f(n)}*{f\left( {2n} \right)}*{g_{b}\left( {4n} \right)}}} \\ \vdots \\ {{h_{q_{\max},b}(n)} = {{f(n)}*\ldots*{f\left( {2^{q_{\max} - 2}n} \right)}*{g_{b}\left( {2^{q_{\max} - 1}n} \right)}}} \end{matrix} & (7) \end{matrix}$

in which the subindex q_(max) represents the maximum (coarsest) scale.

It is now possible to present an expression for H, which is composed of one biphasic and one monophasic part,

$\begin{matrix} {H = \begin{bmatrix} {\tilde{H}}_{b} & {\tilde{H}}_{m} \end{bmatrix}} & (8) \end{matrix}$

where the biphasic H_(b) is defined as

$\begin{matrix} {H_{b} = \begin{bmatrix} h_{q_{\min},b} & \ldots & h_{q_{\max},b} \end{bmatrix} = \begin{bmatrix} {h_{q_{\min},b}(0)} & \ldots & {h_{q_{\max},b}(0)} \\ \vdots & \ddots & \vdots \\ {h_{q_{\min},b}\left( {M - 1} \right)} & \ldots & {h_{q_{\max},b}\left( {M - 1} \right)} \end{bmatrix}} & (9) \end{matrix}$

where the subindex q_(min) represents the minimum scale and q_(min)<q_(max). The monophasic matrix H_(m) is computed in a corresponding way. The reversed order of the columns in Ĥ, denoted by $\tilde{H}$ in (8), is introduced in order to be consistent with the model assumed in (4).

In order to mimic the desired mono- and biphasic waveforms with a low complexity filter bank structure, short filters with small integer coefficients were used. In (7), the impulse response f(n) was chosen as a third order binomial function,

$\begin{matrix} {{F(z)} = {\left. \left( {1 + z^{- 1}} \right)^{L} \right|_{L = 3} = {1 + {3z^{- 1}} + {3z^{- 2}} + z^{- 3}}}} & (10) \end{matrix}$

where F(z) is the Z-transform of f(n). For the biphasic filter bank, the filter g_(b)(n) was selected as the first order difference,

$\begin{matrix} {{g_{b}(n)} = \begin{bmatrix} {- 1} & 1 \end{bmatrix}} & (11) \end{matrix}$

The filter g_(m)(n) was chosen such that a compromise between the requirement of a DC gain equal to zero and an approximately monophasic impulse response was achieved. A reduction of complexity results when g_(b)(n) is reused,

$\begin{matrix} {{g_{m}(n)} = {{{g_{b}(n)}*{g_{b}(n)}} = \begin{bmatrix} 1 & {- 2} & 1 \end{bmatrix}}} & (12) \end{matrix}$

For this particular choice of g_(m)(n) and by using Mallat's algorithm, it is possible to calculate both the biphasic and the monophasic filter output from each scale by using f(n) once and g_(b)(n) twice. The filter bank impulse responses are shown in FIGS. 1 a and 1 b. The filter bank includes two orthogonal signal sets where the width of the signal varies within each set. In FIG. 1 a, h_(j,b) (n) is shown for j=2, . . . ,4 from top to bottom, and in FIG. 1 b, the corresponding h_(j,m)(n) are shown.
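The recursion in (7) with the filters in (10) to (12) can be sketched as follows. This is a minimal sketch, not the implementation of the apparatus: it assumes that f(2n) and g(2n) denote the filters with zeros inserted between their coefficients (the usual dyadic, "à trous" reading), and the names `convolve`, `upsample` and `impulse_response` are hypothetical helpers introduced for illustration.

```python
def convolve(a, b):
    """Full discrete convolution of two coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def upsample(h, factor):
    """Insert factor-1 zeros between coefficients (assumed meaning of h(factor*n))."""
    out = []
    for k, c in enumerate(h):
        out.append(c)
        if k < len(h) - 1:
            out.extend([0] * (factor - 1))
    return out


F = [1, 3, 3, 1]            # third order binomial lowpass filter, eq. (10)
G_B = [-1, 1]               # first order difference, eq. (11)
G_M = convolve(G_B, G_B)    # eq. (12)


def impulse_response(q, g):
    """h_q(n) = f(n) * f(2n) * ... * f(2^(q-2) n) * g(2^(q-1) n), cf. eq. (7)."""
    h = [1]
    for s in range(q - 1):
        h = convolve(h, upsample(F, 2 ** s))
    return convolve(h, upsample(g, 2 ** (q - 1)))
```

Under these assumptions, `impulse_response(2, G_B)` is antisymmetric (biphasic) and every response built from `G_B` or `G_M` sums to zero, i.e. has zero DC gain.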

ML Parameter Estimation

The unknown coefficient vector θ(n) can be estimated by using the maximum likelihood criterion according to

$\begin{matrix} {{\hat{\theta}(n)} = {\arg\max\limits_{\theta{(n)}}{p\left( {{x(n)};{\theta(n)}} \right)}}} & (13) \end{matrix}$

However, $\hat{\theta}(n)$ is only of interest for that n for which the probability of an event, or, equivalently, the likelihood ratio test function T(x(n)), is maximized,

$\begin{matrix} {\hat{n} = {\arg\max\limits_{n}{T\left( {x(n)} \right)}}} & (14) \end{matrix}$

For this case, T(x(n)) can be shown to be [14],

$\begin{matrix} {{T\left( {x(n)} \right)} = {{x(n)}^{T}{H\left( {H^{T}H} \right)}^{- 1}H^{T}{x(n)}}} & (15) \end{matrix}$

for the case when $\sigma_{\omega}^{2}$ is assumed to be constant.

The optimal estimate $\hat{\theta}$ for the detected event at $\hat{n}$ is thus expressed as

$\begin{matrix} {\hat{\theta} = {\arg\max\limits_{\theta{(\hat{n})}}{p\left( {{x\left( \hat{n} \right)};{\theta\left( \hat{n} \right)}} \right)}}} & (16) \end{matrix}$

The MLE of $\theta(\hat{n})$ is found by maximizing $p(x(\hat{n});\theta(\hat{n}))$, or equivalently minimizing the MSE,

$\begin{matrix} {{\left( {{x\left( \hat{n} \right)} - {H\theta\left( \hat{n} \right)}} \right)^{T}\left( {{x\left( \hat{n} \right)} - {H\theta\left( \hat{n} \right)}} \right)} = {{{x\left( \hat{n} \right)}^{T}{x\left( \hat{n} \right)}} + {{\theta\left( \hat{n} \right)}^{T}H^{T}H{\theta\left( \hat{n} \right)}} - {2{\theta\left( \hat{n} \right)}^{T}H^{T}{x\left( \hat{n} \right)}}}} & (17) \end{matrix}$

Differentiation with respect to $\theta(\hat{n})$ yields the optimum θ,

$\begin{matrix} {{\hat{\theta}\left( \hat{n} \right)} = {\left( {H^{T}H} \right)^{- 1}H^{T}{x\left( \hat{n} \right)}}} & (18) \end{matrix}$
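As a worked step, offered only as a sketch under the model in (4), let J denote the quadratic form in (17); the symbol J is introduced here for illustration and does not appear in the original derivation. Setting the gradient of J with respect to $\theta(\hat{n})$ to zero gives the normal equations,

$\begin{matrix} {\frac{\partial J}{\partial\theta\left( \hat{n} \right)} = {{{2H^{T}H{\theta\left( \hat{n} \right)}} - {2H^{T}{x\left( \hat{n} \right)}}} = 0}}\quad\Rightarrow\quad{{H^{T}H{\hat{\theta}\left( \hat{n} \right)}} = {H^{T}{x\left( \hat{n} \right)}}} \end{matrix}$

Since H has full rank, $H^{T}H$ is invertible and the solution is the expression in (18). Comparing with (15), the test function evaluated at the detected event can then be written as $T(x(\hat{n})) = x(\hat{n})^{T}H\hat{\theta}(\hat{n})$, so the same matrix-vector products serve both detection and feature extraction.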

By using the above formulation, it is also possible to derive a generalized likelihood ratio detector based on (15).
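The test function in (15), the search in (14) and the estimate in (18) can be illustrated with a small pure-Python sketch. It is not the detector of the apparatus: the function names `solve`, `detection_statistic` and `detect` are hypothetical, H is passed as a list of M rows with P entries each, and the normal equations are solved by plain Gaussian elimination instead of an explicit inverse.

```python
def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, a[r])) + [float(b[r])] for r in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    z = [0.0] * n
    for r in reversed(range(n)):
        z[r] = (m[r][n] - sum(m[r][k] * z[k] for k in range(r + 1, n))) / m[r][r]
    return z


def detection_statistic(h, x):
    """Return (T, theta_hat) with T = x^T H (H^T H)^-1 H^T x, cf. (15) and (18)."""
    rows, cols = len(h), len(h[0])
    htx = [sum(h[m][j] * x[m] for m in range(rows)) for j in range(cols)]
    hth = [[sum(h[m][i] * h[m][j] for m in range(rows)) for j in range(cols)]
           for i in range(cols)]
    theta = solve(hth, htx)                      # (H^T H)^-1 H^T x
    return sum(a * b for a, b in zip(htx, theta)), theta


def detect(h, signal):
    """Return (n_hat, T_max): the window start that maximizes T, cf. (14)."""
    rows = len(h)
    scores = [detection_statistic(h, signal[n:n + rows])[0]
              for n in range(len(signal) - rows + 1)]
    n_hat = max(range(len(scores)), key=scores.__getitem__)
    return n_hat, scores[n_hat]
```

For a noise-free toy signal that contains a linear combination of the columns of H, the maximum of T falls on the window that holds the event, and `detection_statistic` returns the combination weights as the coefficient estimate.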

Rate

The central parameter for classification of cardiac arrhythmias is the heart rate. Most arrhythmias are defined in terms of heart rate, although sometimes with rather fuzzy limits. Consequently, rate should be considered in order to improve performance. The RR interval, Δt, is defined as the duration between two consecutive events,

$\begin{matrix} {{\Delta{\hat{t}}_{k}} = {\left( {{\hat{n}}_{k} - {\hat{n}}_{k - 1}} \right)T_{s}}} & (19) \end{matrix}$

where $\hat{n}_{k}$ and $\hat{n}_{k-1}$ denote the occurrence times of the events, and $T_{s}$ denotes the sampling period.

Leader-Follower Clustering

The choice of leader-follower clustering is based on a number of features which make it suitable for the present invention, viz.

- on-line processing (non-iterative), and
- self-learning, i.e., no a priori knowledge of the clusters is needed.

The starting point is the assumption that an event is present for which it should be decided whether it belongs to an already existing cluster or if a new cluster should be initiated. Since no knowledge is available a priori on which rhythms or morphologies to expect, the chosen algorithm must be self-learning. The leader-follower clustering algorithm is constituted by four quantities:

1. The event parameter vector, φ(n_(k)), containing the features of the k:th event that occurs at time n_(k).
2. The cluster center μ_(i) and the covariance matrix Σ_(i) that together define the i:th cluster. Since both μ_(i) and Σ_(i) are unknown, they are replaced by their estimates $\hat{\mu}_{i}$ and $\hat{\Sigma}_{i}$, respectively.
3. The metric, d_(i) ², which determines the distance between φ(n_(k)) and μ_(i) according to some suitable function.
4. A rule for adaptation of the cluster parameters for the winning cluster.

Initialization of New Clusters

During run-time, a finite number of clusters exists which represent the rhythms having appeared until the present time. Thus, it is occasionally necessary to initialize a new cluster when the existing ones do not fit the present event sufficiently well. When the distance function d_(i) ² exceeds a certain threshold, η, it is more likely that the event belongs to a new cluster than to any of the existing clusters. The selection of η is a tradeoff between cluster size and cluster resolution in the sense that choosing a small η will result in many clusters with few clustering errors. On the other hand, a large η results in few clusters but in more errors.

The minimum distance between φ(n) and μ_(i) with respect to both i and n is

$\begin{matrix} {{d_{\min}^{2} = {\min\limits_{i,l}{d_{i}^{2}(l)}}},\quad{i = 1},\ldots,I,\quad{l = {{\hat{n}}_{k} - \frac{K}{2}}},\ldots,{{\hat{n}}_{k} + \frac{K}{2}}} & (20) \end{matrix}$

over the search interval K. The corresponding minimum distance indices are found as

$\begin{matrix} {\left\lbrack {i_{\min},n_{\min}} \right\rbrack = {\arg\min\limits_{i,l}{d_{i}^{2}(l)}}} & (21) \end{matrix}$

The decision rule based on the comparison of d_(min) ² and η is expressed as

$\begin{matrix} {d_{\min}^{2}\left\{ \begin{matrix} {{> \eta}\text{:}\quad{\text{Initialize new cluster}}} \\ {{< \eta}\text{:}\quad{\text{Assign to winning cluster}}} \\ {{< {\rho\eta}}\text{:}\quad{\text{Update winning cluster}}} \end{matrix} \right.} & (22) \end{matrix}$

where 0<ρ<1. When the upper relation in (22) holds, a new cluster is initialized by first increasing the number of clusters by one, I=I+1, given that I<I_(max) where I_(max) is the maximum number of clusters, and then initializing the new cluster as

$\begin{matrix} {{\hat{\mu}}_{I} = {\Phi\left( {\hat{n}}_{k} \right)}} & (23) \end{matrix}$

On the other hand, if I=I_(max) the algorithm needs to discard one of the existing clusters. This can be done by elimination of, e.g., the oldest or the “smallest” cluster. By the term “smallest” cluster is meant that cluster which most rarely is fitted with a detected cardiac event.

For the middle and lower conditions in (22), the existing minimum distance cluster i_(min) is selected as the winning cluster. However, only the lower condition results in a cluster parameter update. The reason for including such a distinction is that only closely similar events should be used for cluster updates in order to reduce contamination.
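The three-way decision in (22), together with the handling of a full cluster table, might be sketched as below. This is an illustrative sketch only: the function names, the action labels and the dictionary layout of a cluster are hypothetical, the case of a distance exactly equal to η is assigned without update as an arbitrary choice, and eviction of the least-used cluster is just one of the options mentioned in the text.

```python
def decide(d2_min, eta, rho):
    """Map the minimum squared distance to an action, cf. eq. (22)."""
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must satisfy 0 < rho < 1")
    if d2_min > eta:
        return "new_cluster"
    if d2_min < rho * eta:
        return "assign_and_update"   # close enough to adapt the winning cluster
    return "assign_only"             # assigned, but not used for adaptation


def add_cluster(clusters, phi, i_max):
    """Start a cluster centred on phi, evicting the least-used one when full."""
    if len(clusters) >= i_max:
        victim = min(range(len(clusters)), key=lambda i: clusters[i]["count"])
        clusters.pop(victim)
    clusters.append({"mu": list(phi), "count": 1})
    return len(clusters) - 1
```

Keeping the update threshold ρη below the assignment threshold η means that borderline events are still labelled with the winning cluster but do not shift its center.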

Mahalanobis Distance Function

The distance between the feature vector Φ(n) and each cluster center $\hat{\mu}_{i}$ is defined as the Mahalanobis distance, which is a normalized Euclidean distance in the sense that it projects the parameter vector elements onto univariate dimensions by including the inverse covariance matrix Σ_(i) ⁻¹. Thus, a feature with a larger variance in Φ(n) will be assigned a larger share of the hyperspace before normalization compared to that with a lower variance. A consequence of normalization is that the Mahalanobis distance works well on correlated data since Σ_(i) ⁻¹ then acts as a decorrelator.
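A minimal sketch of the squared Mahalanobis distance and of the joint search over clusters and samples in (20) and (21) is given below. The function names and the data layout (a dictionary from sample index to feature vector, and a list of center and inverse covariance pairs) are assumptions made for illustration.

```python
def mahalanobis_sq(phi, mu, sigma_inv):
    """Squared Mahalanobis distance (phi - mu)^T Sigma^-1 (phi - mu)."""
    e = [p - m for p, m in zip(phi, mu)]
    n = len(e)
    return sum(e[r] * sigma_inv[r][c] * e[c] for r in range(n) for c in range(n))


def grid_search(candidates, clusters):
    """Return (d2_min, i_min, n_min) over all clusters and candidate samples.

    candidates: {sample index l: feature vector}
    clusters:   list of (center, inverse covariance) pairs
    """
    return min(
        (mahalanobis_sq(phi, mu, s_inv), i, l)
        for l, phi in candidates.items()
        for i, (mu, s_inv) in enumerate(clusters)
    )
```

With an inverse covariance of diag(1/4, 1), a deviation of 2 along the first (high variance) feature and a deviation of 1 along the second give the same distance, which is the normalization effect described above.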

When searching for the minimum distance, a grid search over n is performed. This grid search is necessary since it not only minimizes the distance but also results in a more accurate fiducial point estimate than what would be the case when only considering $T(x(\hat{n}_{k}))$. The minimum distance is thus found by a grid search with respect to all existing clusters, i=1, . . . , I, and all feature vectors within the duration of an event, l, as defined in (20),

$\begin{matrix} {{d_{i}^{2}(l)} = {\left( {{\Phi(l)} - {\hat{\mu}}_{i}} \right)^{T}{{\hat{\Sigma}}_{i}^{- 1}\left( {{\Phi(l)} - {\hat{\mu}}_{i}} \right)}}} & (24) \end{matrix}$

Reference Feature Adaptation

In order to track changes in the features of the different rhythms, adaptation of both $\hat{\mu}_{i_{\min}}(k)$ and $\hat{\Sigma}_{i_{\min}}^{-1}(k)$ is desirable; here the event index k has been included for clarity. For $\hat{\mu}_{i_{\min}}(k)$, an exponentially updated average is used:

$\begin{matrix} {{{\hat{\mu}}_{i_{\min}}(k)} = {{{\hat{\mu}}_{i_{\min}}\left( {k - 1} \right)} + {{\gamma\varepsilon}(k)}}} & (25) \end{matrix}$

where

$\begin{matrix} {{\varepsilon(k)} = {{\Phi\left( n_{\min} \right)} - {{\hat{\mu}}_{i_{\min}}\left( {k - 1} \right)}}} & (26) \end{matrix}$

The exponential update factor γ is confined to the interval 0<γ<1.
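The center update in (25) and (26) amounts to moving the winning cluster center a fraction γ of the way towards the new feature vector. A minimal sketch, with the hypothetical function name `update_center`:

```python
def update_center(mu_prev, phi, gamma):
    """Exponentially updated cluster center, cf. eqs. (25)-(26)."""
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must satisfy 0 < gamma < 1")
    eps = [p - m for p, m in zip(phi, mu_prev)]          # eq. (26)
    mu = [m + gamma * e for m, e in zip(mu_prev, eps)]   # eq. (25)
    return mu, eps
```

A small γ gives a slowly varying center that is robust to single outlying events, while a γ close to one makes the center follow the most recent event almost completely.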

The inverse covariance matrix, $\hat{\Sigma}_{i_{\min}}^{-1}(k)$, is estimated by exponential averaging of the new cluster difference matrix ε(k)ε^(T)(k) using the update factor (1−α),

$\begin{matrix} {{{\hat{\Sigma}}_{i_{\min}}^{- 1}(k)} = \left( {{\alpha{{\hat{\Sigma}}_{i_{\min}}\left( {k - 1} \right)}} + {\left( {1 - \alpha} \right){\varepsilon(k)}{\varepsilon^{T}(k)}}} \right)^{- 1}} & (27) \end{matrix}$

Using the matrix inversion lemma

$\begin{matrix} {A = {B^{- 1} + {{CD}^{- 1}C^{T}}}} & (28) \end{matrix}$

$\begin{matrix} {A^{- 1} = {B - {{{BC}\left( {D + {C^{T}{BC}}} \right)}^{- 1}C^{T}B}}} & (29) \end{matrix}$

and pairing the terms in (27) with the ones in (28),

$\begin{matrix} {A = {{\hat{\Sigma}}_{i_{\min}}(k)}} & (30) \end{matrix}$

$\begin{matrix} {B^{- 1} = {\alpha{{\hat{\Sigma}}_{i_{\min}}\left( {k - 1} \right)}}} & (31) \end{matrix}$

$\begin{matrix} {C = {\varepsilon(k)}} & (32) \end{matrix}$

$\begin{matrix} {D^{- 1} = \left( {1 - \alpha} \right)} & (33) \end{matrix}$

the inverse $\hat{\Sigma}_{i_{\min}}^{-1}(k)$ in (27) may be computed without any matrix inversions as

$\begin{matrix} {{{\hat{\Sigma}}_{i_{\min}}^{- 1}(k)} = {{\alpha^{- 1}{{\hat{\Sigma}}_{i_{\min}}^{- 1}\left( {k - 1} \right)}} - \frac{\alpha^{- 1}{{\hat{\Sigma}}_{i_{\min}}^{- 1}\left( {k - 1} \right)}{\varepsilon(k)}{\varepsilon^{T}(k)}\alpha^{- 1}{{\hat{\Sigma}}_{i_{\min}}^{- 1}\left( {k - 1} \right)}}{\left( {1 - \alpha} \right)^{- 1} + {{\varepsilon^{T}(k)}\alpha^{- 1}{{\hat{\Sigma}}_{i_{\min}}^{- 1}\left( {k - 1} \right)}{\varepsilon(k)}}}}} & (34) \end{matrix}$

By utilizing the matrix inversion lemma, the computational complexity of the operation is reduced from O(P³) to O(P²).
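The rank-one update of the inverse covariance estimate can be sketched as follows. The function name `update_inverse_cov` is hypothetical, the previous inverse is assumed to be symmetric, and the sketch uses plain nested lists so that the O(P²) operation count stays visible.

```python
def update_inverse_cov(s_inv_prev, eps, alpha):
    """Inverse covariance update without matrix inversion, cf. eq. (34)."""
    p = len(eps)
    # B = alpha^-1 * Sigma^-1(k-1)
    b = [[v / alpha for v in row] for row in s_inv_prev]
    # B eps; since B is symmetric, eps^T B is the transpose of this vector
    be = [sum(b[r][c] * eps[c] for c in range(p)) for r in range(p)]
    # Scalar denominator (1 - alpha)^-1 + eps^T B eps
    denom = 1.0 / (1.0 - alpha) + sum(e * v for e, v in zip(eps, be))
    return [[b[r][c] - be[r] * be[c] / denom for c in range(p)] for r in range(p)]
```

As a check under these assumptions, starting from the identity matrix with ε = [1, 2] and α = 0.9, the covariance in (27) becomes [[1.0, 0.2], [0.2, 1.3]], and the function returns its inverse up to rounding.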

Algorithm Initialization

In leader-follower clustering, clusters are initialized as they become needed. This feature is convenient since it does not automatically introduce any unused clusters. However, it also puts demands on the algorithm to be able to create new clusters when necessary, and also to terminate clusters either not used for long or with only a few events. Initially, the total number of clusters, I, is equal to one. The algorithm is initialized by assigning the parameter vector $\Phi(\hat{n}_{1})$ of the first event, which maximizes the test statistic in (14), to the first cluster, cf. (23),

$\begin{matrix} {{\hat{\mu}}_{1} = {\Phi\left( {\hat{n}}_{1} \right)}} & (35) \end{matrix}$

For the general case, $\Phi(\hat{n}_{k})$ is composed of a subset of the representative functions together with the preceding RR-interval,

$\begin{matrix} {{\Phi\left( {\hat{n}}_{k} \right)} = \begin{bmatrix} \frac{{\hat{\theta}}_{s}\left( {\hat{n}}_{k} \right)}{\left\| {{\hat{\theta}}_{s}\left( {\hat{n}}_{k} \right)} \right\|} \\ {\Delta t_{k}} \end{bmatrix}} & (36) \end{matrix}$

where $\hat{\theta}_{s}(\hat{n}_{k})$ is a subset of the most discriminating elements in $\hat{\theta}(\hat{n}_{k})$.

It should be noted that an event is defined by its depolarization wave. Consequently, Δt is not included in the arrival time estimation of $\hat{n}_{k}$, but is instead computed afterwards. The time continuous notation of Δt_(k) is preferred since it results in a suitable magnitude similar to the normalized morphological information in θ_(s).
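Building the event feature vector from normalized wavelet coefficients and the preceding RR interval, cf. (19) and (36), could look like the sketch below. The function names, the index list `selected` and the use of the Euclidean norm for the normalization are assumptions made for illustration.

```python
import math


def rr_interval(n_k, n_prev, ts):
    """RR interval in seconds from two event sample indices, cf. eq. (19)."""
    return (n_k - n_prev) * ts


def feature_vector(theta_hat, selected, n_k, n_prev, ts):
    """Normalized coefficient subset followed by the RR interval, cf. eq. (36)."""
    theta_s = [theta_hat[j] for j in selected]       # most discriminating elements
    norm = math.sqrt(sum(v * v for v in theta_s))    # Euclidean norm (assumed)
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero coefficient subset")
    return [v / norm for v in theta_s] + [rr_interval(n_k, n_prev, ts)]
```

For example, with a sampling period of 1 ms, events 800 samples apart give an RR interval of 0.8 s, which is of the same order as the unit-norm morphology part of the vector.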

The inverse covariance matrix estimate is initialized in the same way for all clusters; a simple solution is to set it equal to a scaling of the identity matrix I,

$\begin{matrix} {{\hat{\Sigma}}_{i}^{- 1} = {\delta I}} & (37) \end{matrix}$

where δ is a design parameter. The complete clustering algorithm for organized events is presented in the flow chart in FIG. 2.

Reduced-Complexity Clustering

In order to develop a more efficient algorithm in terms of performance versus power consumption and to evaluate the power consumption itself, a simple measure of the computational complexity can be used, namely, the total number of multiplications. A multiplication is a much more complex operation than an addition. In this algorithm, where most operations are of the nature “multiply-accumulate”, the number of additions is of the same order as the number of multiplications and may thus be neglected without significant loss of accuracy.

In order to reduce the computational complexity, focus is put on reducing the number of multiplications. The dominant contributions of multiplication operations are found in (24) and (34), which require P(P+1) and (P/2)(3P+5) multiplications, respectively, considering certain symmetry properties of $\hat{\Sigma}_{i}^{-1}$. Furthermore, one division is required in (34). However, according to (20), (24) is performed IK times per event while (34) is performed only once per event.

Based on the above performance figures, a few approximations can be identified:

- to use only the peak(s) in $T(x(\hat{n}))$ instead of a complete grid search,
- to use a simplified $\hat{\Sigma}_{i}^{-1}$, and
- to use a likelihood based search sequence over the clusters, i.e., to start with the most likely cluster and to stop the search if a sufficiently small distance is found.

By simplifying the grid search from spanning both samples and clusters in (20) to span over only clusters,

$\begin{matrix} {d_{\min}^{2} = {\min\limits_{i}{d_{i}^{2}\left( \hat{n} \right)}}} & (38) \end{matrix}$

the number of multiplications may be reduced by a factor K to IP(P+1). However, due to sensitivity in $T(x(\hat{n}))$, the feature vectors resulting in the peaks for two different events may differ significantly, so this simplification is likely to result in more clusters. A useful compromise may instead be to use the coefficients from, e.g., the 3 largest peaks in $T(x(\hat{n}))$, resulting in 3IP(P+1) multiplications per event.

Since a cardiac event lasts longer than one sample, those samples which give the filter coefficients most similar to the cluster reference are determined. A grid search over 40 msec is preferably made. In this way the coefficients of greatest importance locally are determined. For this decision T(x(n)) is considered and samples generating a peak are chosen, i.e. T(x(n−1))<T(x(n))>T(x(n+1)), since they indicate the probability for the presence of an event.
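The peak criterion T(x(n−1))<T(x(n))>T(x(n+1)) can be sketched as a simple scan over the test function values inside the search window. The function name `local_peaks` and the default of three peaks are illustrative choices.

```python
def local_peaks(t, max_peaks=3):
    """Indices n with t[n-1] < t[n] > t[n+1], ordered from largest to smallest peak."""
    peaks = [n for n in range(1, len(t) - 1) if t[n - 1] < t[n] > t[n + 1]]
    return sorted(peaks, key=lambda n: t[n], reverse=True)[:max_peaks]
```

Only the feature vectors at the returned indices would then be compared with the cluster references, instead of every sample in the window.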

Another simplification, based on the approximation of orthogonality between the different wavelet scales as well as the RR interval, is to simplify $\hat{\Sigma}_{i}^{-1}$ by only including its diagonal elements in the adaptation, $\begin{matrix} {\hat{\Sigma}_{i_{\min}}^{-1} \approx \alpha^{-1}\hat{\Sigma}_{i_{\min}}^{-1} + \left( I - \alpha \right){diag}\left( \varepsilon(k)\varepsilon^{T}(k) \right)} & (39) \end{matrix}$ where diag(A) returns the diagonal elements of the square matrix A. By using this approximation, the number of multiplications used for the estimation of $\hat{\Sigma}_{i}^{-1}$ is reduced to 3P. Additionally, the distance computation in (20) is simplified and may be reduced to 2IKP multiplications per event.
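The saving in the distance computation can be sketched as follows. This is an illustration only: the function names and the numerical values are hypothetical, and the sketch covers just the distance for one cluster and one sample, not the adaptation in (39). The full form uses P(P+1) multiplications and the diagonal form 2P.

```python
def mahalanobis_sq_full(x, mu, inv_cov):
    """Squared distance with a full inverse covariance matrix."""
    e = [xi - mi for xi, mi in zip(x, mu)]
    # Matrix-vector product: P*P multiplications.
    v = [sum(row[c] * e[c] for c in range(len(e))) for row in inv_cov]
    # Inner product: P multiplications, P(P+1) in total.
    return sum(ei * vi for ei, vi in zip(e, v))


def mahalanobis_sq_diag(x, mu, inv_var):
    """Squared distance using only the diagonal of the inverse covariance."""
    total = 0.0
    for xi, mi, s in zip(x, mu, inv_var):
        e = xi - mi
        total += s * e * e  # two multiplications per element, 2P in total
    return total


# Hypothetical feature vector, cluster center and inverse covariance.
x = [1.0, 2.0, 3.0]
mu = [0.0, 0.0, 1.0]
inv_cov = [[1.0, 0.1, 0.0],
           [0.1, 0.5, 0.0],
           [0.0, 0.0, 0.25]]
inv_var = [inv_cov[p][p] for p in range(len(x))]

print(mahalanobis_sq_full(x, mu, inv_cov))
print(mahalanobis_sq_diag(x, mu, inv_var))
```

The two values differ whenever the off-diagonal elements are non-zero, which is why a given threshold does not mean the same thing for the full and the diagonal versions.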

A reasonable assumption is often that successive events originate from the same rhythm. Given this, it would often be sufficient to compute the distance only for the previously selected cluster. In doing so, the number of multiplications in (38) is reduced even further.
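A minimal sketch of such an ordered search with early termination is given below. It assumes the diagonal distance and a stopping level `stop_below`; that parameter, the function name and the example clusters are all hypothetical and are not taken from the description.

```python
def ordered_search(x, clusters, order, stop_below):
    """Search clusters in the given order and stop early on a small distance.

    clusters: list of (mu, inv_var) pairs (diagonal inverse covariance assumed).
    order: cluster indices, most likely cluster first.
    Returns (best index, best squared distance, number of clusters evaluated).
    """
    best_i, best_d2, evaluated = None, float("inf"), 0
    for i in order:
        mu, inv_var = clusters[i]
        d2 = sum(s * (xi - mi) ** 2 for xi, mi, s in zip(x, mu, inv_var))
        evaluated += 1
        if d2 < best_d2:
            best_i, best_d2 = i, d2
        if d2 < stop_below:  # sufficiently small distance: stop the search
            break
    return best_i, best_d2, evaluated


# Hypothetical clusters; the previous event was assigned to cluster 1.
clusters = [([0.0, 0.0], [1.0, 1.0]),
            ([5.0, 5.0], [1.0, 1.0]),
            ([10.0, 0.0], [1.0, 1.0])]
previous = 1
order = [previous] + [i for i in range(len(clusters)) if i != previous]
print(ordered_search([5.2, 4.9], clusters, order, stop_below=1.0))
```

When the rhythm is unchanged, only one distance is evaluated per event; after a rhythm change the loop falls back to scanning the remaining clusters.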

Table 1 presents the different detector versions as defined by their distinguishing features and shows computational complexity for the different versions of the clustering algorithm.

TABLE 1

| Version | Feature: $\hat{\Sigma}_{i_{\min}}^{-1}$ | Feature: Search alg. | Complexity C |
|---|---|---|---|
| A | Full | Interval | $IKP\left( P + 1 \right) + \frac{P}{2}\left( 3P + 5 \right)$ |
| B | Diagonal | Interval | $2IKP + 3P$ |
| C | Full | 3 peak | $3IP\left( P + 1 \right) + \frac{P}{2}\left( 3P + 5 \right)$ |
| D | Diagonal | 3 peak | $6IP + 3P$ |

The total computational complexity, C, as reflected by the number of multiplications for the different algorithm versions, is presented in FIG. 3 as a function of I. In FIG. 3, version A is shown by a solid line, version B by a dashed line, version C by a dotted line, and version D by a dash-dotted line, for a feature vector with (a) P=4 and (b) P=7 elements.
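The expressions in Table 1 can be evaluated directly. The following Python sketch is illustrative only: the function name is hypothetical, and the choice I=4, K=40 is merely an example operating point.

```python
def multiplications(version, I, K, P):
    """Number of multiplications per event for versions A-D of Table 1."""
    full_update = P * (3 * P + 5) // 2  # P/2 (3P+5); P(3P+5) is always even
    diag_update = 3 * P
    if version == "A":  # full matrix, interval search
        return I * K * P * (P + 1) + full_update
    if version == "B":  # diagonal matrix, interval search
        return 2 * I * K * P + diag_update
    if version == "C":  # full matrix, 3 peak search
        return 3 * I * P * (P + 1) + full_update
    if version == "D":  # diagonal matrix, 3 peak search
        return 6 * I * P + diag_update
    raise ValueError(f"unknown version: {version}")


for P in (4, 7):
    counts = {v: multiplications(v, I=4, K=40, P=P) for v in "ABCD"}
    print(P, counts)
```

For I=4, K=40 and P=4 this gives 3234, 1292, 274 and 108 multiplications for versions A to D, respectively.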

Results

The results are obtained by studying the performance of the following quantities:

-   -   algorithm versions, as presented in Table 1, and     -   noise tolerance, for noise-free signals and for signals with         background noise of 50 μV RMS.

The following parameter settings have been used (unless otherwise stated): α=1.05, γ=0.025, δ=50, K=40  (40)

It is noted that α⁻¹ is chosen to offer faster adaptation than γ. The reason is that the initial estimate $\hat{\Sigma}_{i}^{-1}$ is likely to be less accurate than the initial estimate μ̂_(i). Also, δ is chosen to have the same order of magnitude as the steady state eigenvalues of $\hat{\Sigma}_{i}^{-1}$.

It should be noted that the different algorithm versions are not fully comparable in terms of performance for a specific η due to the differences in distance computation. In versions B and D, where a diagonal $\hat{\Sigma}_{i_{\min}}^{-1}$ is used, the lack of non-diagonal information results in a nonorthogonal distance which is larger than the orthogonal one. Since versions C and D make use of a limited search, the minimum distance found may differ from the global minimum distance for the event. Both these algorithmic differences imply an increase in clustering quality for a certain value of η, however, at the expense of more introduced clusters.

Evaluation Measures

The two main quantities evaluated are clustering performance and computational complexity, see FIG. 3; these two quantities are in general conflicting. In the evaluation, a cluster is assigned to that cardiac rhythm which contributes the most events to the cluster. This rhythm is denoted as the dominant rhythm within the cluster. With respect to the dominant rhythm, the cluster is defined as a correct cluster, whereas it is erroneous with respect to all other rhythms. If a rhythm is found to be dominant in more than one cluster, such clusters are first merged in the performance evaluation. The number of events in the correct cluster which belong to the i:th dominant rhythm is equal to N_(D)(i), while the number of events in the cluster belonging to any other, false, rhythm is equal to N_(F)(i). The number of events of the dominant rhythm which are not classified in a correct cluster, i.e., missing, is equal to N_(M)(i).

The performance of the algorithm is evaluated in terms of probability of a correct clustering of an event, P_(CC)(i), and probability of a dominant event in a cluster, P_(DE)(i). The first parameter is expressed as the share of correctly clustered events within a rhythm and is, using the above parameters, defined by $\begin{matrix} {{P_{CC}(i)} = \frac{N_{D}(i)}{{N_{D}(i)} + {N_{M}(i)}}} & (41) \end{matrix}$

The second parameter may be expressed as the share of dominant rhythm events within a cluster, and is defined as $\begin{matrix} {{P_{DE}(i)} = \frac{N_{D}(i)}{{N_{D}(i)} + {N_{F}(i)}}} & (42) \end{matrix}$

For the case when a rhythm completely lacks a correct cluster, P_(DE)(i) is undefined; the rhythm is then excluded from subsequent statistical computations. Averaging P_(CC)(i) and P_(DE)(i) over all clusters results in the global performance measures P_(CC) and P_(DE), respectively.
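The evaluation measures can be sketched in Python as follows. The sketch assumes that each event is available as a pair of its true rhythm and its assigned cluster, and it takes P_(DE)(i) as the share of dominant rhythm events within the merged cluster, N_(D)(i)/(N_(D)(i)+N_(F)(i)), following the verbal definition above. The function name, the rhythm labels and the event counts are hypothetical, and ties for the dominant rhythm are resolved by order of first occurrence, which is an arbitrary choice.

```python
from collections import Counter, defaultdict


def clustering_performance(events):
    """events: iterable of (rhythm, cluster_id). Returns {rhythm: (P_CC, P_DE)}."""
    per_cluster = defaultdict(Counter)
    rhythm_totals = Counter()
    for rhythm, cluster in events:
        per_cluster[cluster][rhythm] += 1
        rhythm_totals[rhythm] += 1

    # Merge all clusters that share the same dominant rhythm.
    merged = defaultdict(Counter)
    for counts in per_cluster.values():
        dominant = counts.most_common(1)[0][0]
        merged[dominant].update(counts)

    # Rhythms without a correct cluster never appear in `merged` and are excluded.
    result = {}
    for rhythm, counts in merged.items():
        n_d = counts[rhythm]                # dominant rhythm events in the cluster
        n_f = sum(counts.values()) - n_d    # events of other rhythms in the cluster
        n_m = rhythm_totals[rhythm] - n_d   # dominant rhythm events that are missing
        result[rhythm] = (n_d / (n_d + n_m), n_d / (n_d + n_f))
    return result


# Hypothetical labelled events: (true rhythm, assigned cluster).
events = ([("SR", 0)] * 8 + [("SR", 1)] * 2 +
          [("AF", 1)] * 6 + [("AF", 2)] * 4)
print(clustering_performance(events))
```

In this example clusters 1 and 2 are both dominated by the rhythm labelled AF and are therefore merged before the counts are taken.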

It should be pointed out that P_(CC)(i) and P_(DE)(i) reach their maximal value of 1 when η is sufficiently small such that the number of clusters equals the total number of beats evaluated. This is, of course, a highly undesirable solution although performance, as expressed in (41) and (42), will be excellent. For this reason, the total number of clusters, I, is a crucial parameter to be considered. Here, the average number of clusters, Ī, is used together with the minimum and maximum number of clusters for a case, I_(min) and I_(max), respectively.

The power consumption of the algorithm is an important parameter which determines the pacemaker life span. In this study, power consumption is approximated by the computational complexity defined above as the total number of multiplications. As shown, the computational complexity depends mainly on three parameters: the feature vector length, P, the search interval, K, and the number of clusters, I.

Noise-Free Signal Clustering Performance

FIGS. 4 a and 4 b illustrate clustering performance in terms of P_(DE) and P_(CC), see FIG. 4 a, and I_(min), Ī and I_(max), see FIG. 4 b. Thus in FIG. 4 a clustering performance is shown as depending on Ī for P_(CC) (dark bars) and P_(DE) (bright bars) for noise-free signals. In FIG. 4 b, the spread between the different cases is shown as, from left to right, I_(min), Ī and I_(max). The presented algorithm versions are found in Table 1 and three values of Ī have been chosen for comparison: 3, 4 and 5. Versions A and B perform similarly for all three cases, and achieve P_(DE)=1 and P_(CC)=1 for Ī=4 and Ī=5, respectively, by creating an acceptable number of clusters. However, it can be seen from FIG. 4 b that a large difference in the number of initialized clusters between different cases is present. For version B, a slight increase in both Ī and I_(max) is observed for Ī=4. However, this increase is due more to an unfortunate step-like behaviour in the results for the given Ī than to any significant decrease in performance. The values of η used in FIGS. 4 a and 4 b for the different versions are shown in Table 2.

TABLE 2 Values of η used in FIG. 4, by algorithm version.

| Ī | A | B | C | D |
|---|---|---|---|---|
| 5 | 3.6 | 4.2 | 5.4 | 5.8 |
| 4 | 4.6 | 4.8 | 7.0 | 7.2 |
| 3 | 9.6 | 9.6 | 12.0 | 10.8 |

Versions C and D perform slightly worse than do versions A and B. Contrary to the latter ones, neither version achieves P_(DE)=1 or P_(CC)=1 for the presented Ī, see FIG. 4 a.

FIG. 5 a shows the clustering performance for version A in detail and its dependence on η. The most noteworthy result is that both P_(DE)=1 and P_(CC)=1 for η<7. It is also clear that clustering performance deteriorates rapidly for η>10. The increase in P_(CC) for very high values of η arises because not all rhythms are allocated to a dominant cluster; such rhythms are disregarded in the performance computation.

In FIG. 5 b, I_(min), Ī and I_(max) are presented as a function of the clustering threshold. Thus FIG. 5 a shows clustering performance in terms of P_(DE) (solid line) and P_(CC) (dashed line) as a function of η for noise-free signals, and FIG. 5 b shows the corresponding I_(min), Ī and I_(max). For η<3, the number of initiated clusters increases rapidly with decreasing η. Within the interval 3<η<7, all rhythms initiate at least one cluster, while for η>7, this is not true for all cases. Removing the “worst case”, the above is true for the other cases for η<10.

The clustering performance is exemplified for one set of parameters in FIG. 6 using η=7. Correct clustering with minimal number of clusters is achieved for two cases while an extra cluster is initialized for two cases. The reason for the extra cluster is that, for case 2, the temporal search interval is chosen too small for the third morphology from the left resulting in an extra cluster. For case 5 an actual difference in morphologies both on the up and down slopes of the dominant peak is discernible between the first and fourth clusters from the left.

Clustering for a concatenated electrogram is presented for case 3 in FIG. 7, where three distinct rhythm classes can be discerned, viz. normal sinus rhythm followed by supraventricular tachycardia and atrial flutter. The EGM is shown together with the clusters, represented by o, x and +, respectively, assigned for each event. The different rhythms result in three clusters.

FIG. 8 is a flow chart illustrating the function in broad outline of an embodiment of the apparatus according to the invention. At block 2 cardiac event features are extracted in the form of wavelet coefficients, and the event is detected, at 4, 6. At block 8 it is checked whether the detected event is a member of a labelled cluster. If so, the event is added to a class, at 10, and actions associated with that class are performed, at 12.

If the event is not a member of a labelled class, it is checked whether it is a member of an existing cluster, at 14. If so, the event is added to a class, at 16, and it is checked whether it is possible to label the cluster, at 18, and if so the cluster is labelled, at 20.

If the event is not a member of an existing cluster, block 14, a new cluster is created as described above fitted to the detected cardiac event, at 22.
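The flow of FIG. 8 can be outlined in a short Python sketch. It is a simplification under several assumptions that are not taken from the description: the nearest cluster is found first and is only then checked for a label, the diagonal distance is used, the squared distance is compared directly with the threshold `eta`, a new cluster starts with a constant `initial_inv_var`, and no adaptation of the cluster parameters is performed. All names are hypothetical.

```python
def diag_distance_sq(x, mu, inv_var):
    """Squared distance with a diagonal inverse covariance."""
    return sum(s * (xi - mi) ** 2 for xi, mi, s in zip(x, mu, inv_var))


def process_event(x, clusters, labels, eta, initial_inv_var=1.0):
    """Handle one detected event; returns (outcome, cluster index or label)."""
    best, best_d2 = None, float("inf")
    for idx, c in enumerate(clusters):
        d2 = diag_distance_sq(x, c["mu"], c["inv_var"])
        if d2 < best_d2:
            best, best_d2 = idx, d2

    if best is None or best_d2 >= eta:
        # Block 22: no existing cluster fits, so a new one is created at the event.
        clusters.append({"mu": list(x),
                         "inv_var": [initial_inv_var] * len(x),
                         "count": 1})
        return ("new_cluster", len(clusters) - 1)

    clusters[best]["count"] += 1
    if best in labels:
        # Blocks 10 and 12: labelled cluster, the associated action can follow.
        return ("act", labels[best])
    # Blocks 16 to 20: unlabelled cluster, labelling may be attempted later.
    return ("unlabelled", best)


clusters, labels = [], {}
print(process_event([1.0, 1.0], clusters, labels, eta=4.0))
print(process_event([1.5, 1.0], clusters, labels, eta=4.0))
labels[0] = "sinus rhythm"  # hypothetical label assigned off-line
print(process_event([0.8, 1.1], clusters, labels, eta=4.0))
print(process_event([6.0, 6.0], clusters, labels, eta=4.0))
```

The label dictionary is kept separate from the cluster list because labelling need not happen at the time an event is assigned.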

Thus according to the invention clustering of events in the EGM is performed for use in implantable CRM devices, like heart stimulators. The invention is based on feature extraction in the wavelet domain, whereupon the features are clustered based on the Mahalanobis distance criterion. According to advantageous embodiments of the apparatus according to the invention, simplifications of the technique are proposed in order to reduce computational complexity and obtain a solution that is more feasible to implement.

If the apparatus according to the invention is to be used for longer periods of data analysis, large clusters, although old, may be desirable to be kept in some way, while the oldest cluster may be selected to be removed if the application is based on shorter time frames. Also, due to the short data lengths, testing of such algorithms would be of limited value.

By combining the detector/clusterer with the labelling rules of a classifier, a complete detector/classifier is obtained, with the possibility of more accurate therapy.

The labelling need not be done in real time and probably more than one event will be needed to label a cluster. Once the cluster has been labelled using clinical terms, the actions associated with the particular class will be carried out immediately, i.e. in real time. Thus, the rules needed to label the cluster are not used in identifying the event itself.

The rules used to label the clusters are based on characteristics of the different possible events. Instead of the exact rules, the characteristics are consequently described.

FIG. 9 is a block diagram of an embodiment of an implantable heart stimulator provided with the apparatus according to the invention. Electrodes 30, 32 implanted in the heart 34 of a patient are connected by a lead 36 to an A/D converter 40, via a switch 38 serving as overvoltage protection for the A/D converter 40. In the A/D converter 40 the signal is A/D converted and the digital signal is supplied to a wavelet detector 42.

The detector 42 decides whether a cardiac event is present or not as described earlier. Wavelet coefficients are calculated as well. Parameters of the detector 42 are programmable from the stimulator microprocessor 44. At the detection the coefficients and the RR information are forwarded to the clusterer 46 in which it is determined to which cluster the detected cardiac event belongs, as described previously. The clusterer 46 is preferably of a leader-follower type and also the cluster parameters are programmable from the microprocessor 44.

By the microprocessor 44 suitable therapy is decided depending on assigned cluster for the detected event and possible a priori knowledge about arrhythmia associated with the cluster in question. Thus it is possible to distinguish e.g. ventricular tachycardia from a sinus tachycardia by comparing the parameters with a known normal sinus rhythm. Parameters of the sinus tachycardia are then supposed to be similar to those of a normal sinus rhythm, whereas parameters of ventricular tachycardia differ significantly.

As an alternative the decision rules can be trained from a number of rhythms and the resulting rules are then used on test data, see Weichao Xu et al., “New Bayesian Discriminator for Detection of Atrial Tachyarrhythmias”, DOI:10.1161/01.CIR.0000012349.14270.54, pp. 1472-1479, January 2002. It is then possible to decide that a certain position of the cluster indicates e.g. a sinus rhythm, etc. Also this technique can be based on analysis of the feature vector for a cluster, and it is possible to decide if the beat is broad or narrow, large or small, or if the rhythm is regular or irregular.

The stimulator in FIG. 9 also includes a pulse unit 48 with associated battery 50 for delivery of stimulation pulses to the patient's heart 34 depending on the clustering evaluation.

The implantable stimulator shown in FIG. 9 includes telemetry means 52 with antenna 54 for communication with external equipment, like a programmer.

Although modifications and changes may be suggested by those skilled in the art, it is the intention of the inventor to embody within the patent warranted hereon all changes and modifications as reasonably and properly come within the scope of his contribution to the art. 

1-13. (canceled)
 14. An apparatus for analyzing cardiac events, comprising: a feature extraction unit, supplied with an electrocardiogram, said feature extraction unit deriving features of cardiac events from said electrocardiogram and determining a feature vector, by a wavelet transform operating on said electrogram, describing waveform characteristics of cardiac events in said electrogram; and a clustering unit provided with said feature vector, said clustering unit determining a distance between said feature vector and corresponding cluster feature vectors and assigning a cardiac event in said electrogram to a particular cluster that results in a minimum distance, as long as said minimum distance is less than a predetermined threshold value.
 15. An apparatus as claimed in claim 14 wherein each of said clusters is defined by a cluster center μ_(i) and a co-variance matrix Σ_(i) for the cluster features of that cluster, and wherein said clustering unit determines a distance function D_(i) ² between each event feature vector and said cluster μ_(i).
 16. An apparatus as claimed in claim 15 wherein said clustering unit calculates said distance using Mahalanobis distance criterion.
 17. An apparatus as claimed in claim 15 wherein said clustering unit determines said minimum distance by a grid search over a duration of the cardiac event.
 18. An apparatus as claimed in claim 14 comprising an integrator that integrates said distance over a predetermined period of time.
 19. An apparatus as claimed in claim 14 wherein said clustering unit updates said cluster feature according to a predetermined rule dependent on said minimum distance.
 20. An apparatus as claimed in claim 14 wherein said clustering unit generates a new cluster if said minimum distance exceeds said predetermined threshold value, by setting features for said new cluster equal to the event features of the cardiac event that resulted in said minimum distance exceeding said predetermined threshold value.
 21. An apparatus as claimed in claim 14 wherein said clustering unit terminates clusters that fail to have a predetermined number of cardiac events grouped therein within a predetermined time period.
 22. An apparatus as claimed in claim 14 wherein said clustering unit performs a likelihood-based search sequence over said clusters to determine said minimum distance.
 23. An apparatus as claimed in claim 14 wherein said clustering unit determines said minimum distance by a grid search only over clusters in which cardiac events have been grouped within a duration of the cardiac event.
 24. An apparatus as claimed in claim 14 wherein said clustering unit calculates a distance of a cardiac event in question from a cluster in which a cardiac event was previously grouped.
 25. An apparatus as claimed in claim 14 comprising a classifier that associates said clusters respectively with different cardiac rhythms according to predetermined rules.
 26. A heart stimulator comprising: a pulse generator adapted to interact with a subject to deliver stimulation pulses to the subject; an apparatus for analyzing cardiac events comprising a feature extraction unit, supplied with an electrocardiogram, said feature extraction unit deriving features of cardiac events from said electrocardiogram and determining a feature vector, by a wavelet transform operating on said electrogram, describing waveform characteristics of cardiac events in said electrogram, and a clustering unit provided with said feature vector, said clustering unit determining a distance between said feature vector and corresponding cluster feature vectors and assigning a cardiac event in said electrogram to a particular cluster that results in a minimum distance, as long as said minimum distance is less than a predetermined threshold value; and an arrhythmia detection and control unit connected to said pulse generator for controlling emission of stimulation pulses from said pulse generator dependent on detection of an arrhythmia dependent on the cluster in which the cardiac event is grouped. 